source("birdsLib.R")
##
## Attaching package: 'monitoR'
## The following object is masked from 'package:tuneR':
##
## readMP3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
We have a large volume of sound recordings and need to find out whether they contain starling calls.
A data set of starling calls (1000+ recordings) is available.
We think the data at hand reasonably cover real conditions.
We are developing an application specifically for starlings.
We consider mel-frequency cepstral coefficients (MFCCs) as features, and HTK and Keras as ML tools.
After experimenting with the data, the following values were chosen as the MFCC parameters.
Happy to explain the choice.
hopt
## [1] 0.025
wint
## [1] 0.075
getMel
## function (x)
## {
## require(tuneR)
## tmp <- melfcc(x, minfreq = 3000, maxfreq = 10000, fbtype = "mel",
## numcep = 12, hopt = hopt, wint = wint)
## return(tmp)
## }
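To make the two timing parameters concrete, here is a small sketch of what `hopt` and `wint` imply for the framing. The sampling rate of 44100 Hz and the helper name `frameInfo` are assumptions for illustration only; the exact frame count returned by `melfcc` depends on how it handles the edges of the signal, so the count below is only approximate.

```r
# Sketch only: frameInfo() is a hypothetical helper, not part of birdsLib.R.
hopt <- 0.025   # hop between frames, seconds (value from the report)
wint <- 0.075   # analysis window length, seconds (value from the report)

frameInfo <- function(dur, sr, hopt, wint) {
  list(
    win.samples   = wint * sr,         # samples per analysis window
    hop.samples   = hopt * sr,         # samples between successive frames
    overlap       = 1 - hopt / wint,   # fraction shared by adjacent windows
    approx.frames = round(dur / hopt)  # rough number of MFCC rows
  )
}

# 15 s excerpt at an assumed sampling rate of 44100 Hz
info <- frameInfo(dur = 15, sr = 44100, hopt = hopt, wint = wint)
str(info)
```

With these values, adjacent windows overlap by two thirds, and each row of the MFCC matrix corresponds to one 25 ms step, which is why the time axis is later built as `(1:nrow(ss)) * hopt`. Note also that `maxfreq = 10000` requires a sampling rate of at least 20 kHz.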
A simple detector to cut bird calls out of lengthy sound recordings would be useful.
I call it “MFCC power”.
The function is shown below.
getMCPow
## function (x, excl = 1)
## {
## colSums(t(abs(x[, -(excl)])))
## }
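As a quick illustration of what this function computes, the sketch below applies it to a made-up 3 x 4 matrix (rows are frames, columns are coefficients). The matrix values are invented for the example. For each frame, the result is the sum of absolute coefficient values with column `excl` left out, which is the same as `rowSums(abs(x[, -excl]))`.

```r
# Copy of the function printed above, so the example is self-contained
getMCPow <- function(x, excl = 1) {
  colSums(t(abs(x[, -(excl)])))
}

# Toy "MFCC" matrix: 3 frames (rows) x 4 coefficients (columns); values invented
m <- matrix(c(10,  1, -2,  3,
              20, -4,  5, -6,
              30,  0,  0,  7),
            nrow = 3, byrow = TRUE)

pow <- getMCPow(m)   # first column is dropped, the rest is summed in absolute value
print(pow)

# Same quantity written with rowSums()
same <- isTRUE(all.equal(pow, rowSums(abs(m[, -1]))))
```

Dropping the first column by default presumably removes the coefficient that mostly tracks overall signal level, so the measure responds to spectral shape rather than loudness; that reading is an interpretation, not something stated in the source.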
sigDir="../"
fn="DecreasingCallLargeTmp.wav"
w<-readWave(paste0(sigDir, fn), from = 0, to = 15, units = "sec")
plot(w)
viewSpec(w)
ss<-getMel(w)
tt<-(1:nrow(ss)) * hopt
matplot(tt, ss, type="l", main="MFCCs, all")
pov=getMCPow(ss)
plot(pov, type="b", main="MFCC Power")
showMCPow(pov)
title(main="Median filter")
win.range<-c(0.4, 0.8)
I use the file DecreasingCallLargeTmp.wav to estimate the target window length.
The target window length is set to the range 0.4 to 0.8 s.
wins<-readWave(paste0(sigDir, fn)) %>%
getMel %>%
getMCPow %>%
getAWins
ll<-with(wins, end-start)
plot(sort(ll))
abline(h=win.range, lty=2, col="red")
Now working on the file MU000_20091021_172000.wav (only the first half hour here, to avoid memory problems)
From 0 to 30 minutes
We look for time windows with enough activity in them
That is, windows where the median filter output is greater than the threshold
The default threshold (see the function “getAWins”) is the top quartile of “pov”
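The idea can be sketched in base R as follows. This is not the actual `getAWins` from birdsLib.R, whose source is not shown here; `sketchAWins`, the median filter width `k`, and the reading of "top quartile" as the 75th percentile are all assumptions made for illustration.

```r
# Hypothetical stand-in for getAWins(); the real function may differ.
# pov: numeric vector of MFCC power, one value per frame.
sketchAWins <- function(pov, hop = 0.025, k = 5,
                        thr = unname(quantile(pov, 0.75))) {
  smooth <- as.vector(stats::runmed(pov, k))  # running median, k must be odd
  active <- smooth > thr                      # frames above the threshold
  r      <- rle(active)                       # runs of consecutive TRUE/FALSE
  ends   <- cumsum(r$lengths)                 # last frame index of each run
  starts <- ends - r$lengths + 1              # first frame index of each run
  keep   <- r$values                          # keep only the active runs
  data.frame(start = (starts[keep] - 1) * hop,  # window start, seconds
             end   = ends[keep] * hop)          # window end, seconds
}

# Toy power trace with two bursts (values invented)
pov.toy  <- c(0, 0, 0, 5, 5, 5, 5, 0, 0, 0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0)
wins.toy <- sketchAWins(pov.toy, k = 3, thr = 1)
print(wins.toy)
```

The result has one row per detected window with `start` and `end` in seconds, which is the shape that `with(action.wins, end-start)` expects.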
## be careful -- remove the tmp file when changing the input wav
tmp.fn="action.wins.RData"
#file.remove(tmp.fn)
if (file.exists(tmp.fn)) {
  load(tmp.fn)
} else {
  action.wins <- (ppov <- paste0(sigDir, lfn) %>%
    readWave(from = 0, to = 30, units = "min") %>%
    mono %>%
    getMel() %>%
    getMCPow()
  ) %>%
    getAWins()
  save(ppov, action.wins, file = tmp.fn)
}
ll<-with(action.wins, end-start)
plot(sort(ll), log="y")
abline(h=win.range, col="red", lty=2)
ii<-which(inrange(ll, win.range))
table(inrange(ll, win.range))
##
## FALSE TRUE
## 740 292
sum(ll[ii])
## [1] 161.875
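The selection step above can be illustrated on made-up durations. The helper `inrange` is not shown in this document, so `inRangeSketch` below is a hypothetical stand-in that assumes inclusive bounds; the durations are invented.

```r
# Hypothetical stand-in for inrange(); assumes both bounds are inclusive.
inRangeSketch <- function(x, rng) x >= rng[1] & x <= rng[2]

win.range <- c(0.4, 0.8)                   # target window length, seconds
ll.toy <- c(0.10, 0.45, 0.60, 0.95, 0.80)  # made-up window durations, seconds

keep   <- which(inRangeSketch(ll.toy, win.range))  # indices of usable windows
n.kept <- length(keep)                             # how many windows survive
total  <- sum(ll.toy[keep])                        # total retained time, seconds
```

Applied to the figures reported above, 292 retained windows totalling 161.875 s give a mean length of about 0.55 s, which sits inside the target range.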
ofn="wins.wav"
winsWav(infile=ifn, outfile=ofn, ww=action.wins[ii,])
## Warning in .local(left, ...): 'samp.rate' not specified, assuming 44100Hz
## Warning in .local(left, ...): 'bit' not specified, assuming 16bit
writeSelectionTable(wins=action.wins, ofn="actionWins.txt")
sigDir<-"/Volumes/VladBackup/Birds/StarlingCalls/"
lfn<-"ma5ho-ehcz5.L.wav"
wL<-mono(readWave(paste0(sigDir, lfn)))
plot(wL, main=lfn)
viewSpec(wL, main=lfn)
ss=getMel(wL)
pov=getMCPow(ss)
showMCPow(pov, tit=lfn)
xll=c(0,300)
op<-par(mfcol=c(3,1), mar=c(0,1,0,1))
viewSpec(wL, start.time = xll[1], page.length = xll[2])
showMCPow(pov, tit=lfn, xl = xll)
plot(wL, xlim=xll)